Encoding semantic relationships in literary texts

Abstract

Encoding meaningful semantic relationships in literary texts is almost as difficult as defining and identifying them. Defining the types and components of the relations that can be extracted from a literary text is a quite challenging task, because literature is full of implicit and oblique messages and references. Subsequently, encoding them is even more challenging, as relations often have neither a clear nor a standard linguistic form, and they usually overlap each other. This paper discusses the modeling issues concerning the mapping of the cultural content of humanities texts, as highlighted by the case of the ECARLE project annotation campaign. To handle these issues, the paper proposes a methodology of minimalistic and flexible encoding techniques, combined in order to generate human-annotated training data for a Relation Extraction machine learning system. The proposed methodology utilizes the available TEI tagset and, without any further customizations, allows the encoding of relationships formed between named entities in a simple yet flexible way, open to reuse, interchange, conversion and visualization.

Similar resources

Annotating Character Relationships in Literary Texts

We present a dataset of manually annotated relationships between characters in literary texts, in order to support the training and evaluation of automatic methods for relation type prediction in this domain (Makazhanov et al., 2014; Kokkinakis, 2013) and the broader computational analysis of literary character (Elson et al., 2010; Bamman et al., 2014; Vala et al., 2015; Flekova and Gurevych, 2...

Literary Figures in Gāthic Texts

Introduction: Gāthic texts are a collection of religious songs of Zarathustra, who lived about 1200 BC. Of the seventy-two hāts (stanzas) of Yasna (one of the five chapters of Avesta), seventeen hāts belong to five Gāthas. These seventeen hāts have been classified into five categories based on their syllabic meter and the number of the song: 1) ahunavaiti, 2) ushtavaiti, 3) spanta.mainyu, ...

Annotating Similes in Literary Texts

Annotated corpora are invaluable resources for researchers in the humanities: on the one hand, for natural language processing tasks, they can serve as standards against which results from new automatic methods can be measured; on the other hand, in corpus-based studies, they make it possible either to answer existing research questions or to explore original ones. In this respect, some annotation frameworks such...

Identifying Literary Texts with Bigrams

We study perceptions of literariness in a set of contemporary Dutch novels. Experiments with machine learning models show that it is possible to automatically distinguish novels that are seen as highly literary from those that are seen as less literary, using surprisingly simple textual features. The most discriminating features of our classification model indicate that genre might be a confoun...

Discovering Multilingual Text Reuse in Literary Texts

We present here a method for automatically discovering several classes of text reuse across different languages, from the most similar (translations) to the most oblique (literary allusions). Allusions are an important subclass of reuse because they involve the appropriation of isolated words and phrases within otherwise unrelated sentences, so that traditional methods of identifying reuse incl...

Journal

Journal title: Balisage Series on Markup Technologies

Year: 2021

ISSN: 1947-2609

DOI: https://doi.org/10.4242/balisagevol26.koidaki01